Theoretical Analysis for Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability

Authors

  • Jichiang Tsai
  • Sy-Yen Kuo
  • Yi-Min Wang
Abstract

Rollback-Dependency Trackability (RDT) is a property that states that all rollback dependencies between local checkpoints are on-line trackable by using a transitive dependency vector. In this paper, we address three fundamental issues in the design of communication-induced checkpointing protocols that ensure RDT. First, we prove that the following intuition commonly assumed in the literature is in fact false: if a protocol forces a checkpoint only at a stronger condition, then it must take at most as many forced checkpoints as a protocol based on a weaker condition. This result implies that the common approach of sharpening the checkpoint-inducing condition by piggybacking more control information on each message may not always yield a more efficient protocol. Next, we prove that there is no optimal on-line RDT protocol that takes fewer forced checkpoints than any other RDT protocol for all possible communication patterns. Finally, since comparing checkpoint-inducing conditions is not sufficient for comparing protocol performance, we present some formal techniques for comparing the performance of several existing RDT protocols.
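
As an illustration of the "transitive dependency vector" mentioned in the abstract, the following is a minimal Python sketch of how such a vector is commonly maintained: each process piggybacks its vector on outgoing messages and takes a component-wise maximum on receipt. The class and method names (`Process`, `send`, `receive`, `checkpoint`) and the interval-numbering convention are assumptions made for this example and are not taken from the paper. The sketch only tracks dependencies; it does not implement any checkpoint-inducing condition, so it does not by itself guarantee RDT.

```python
class Process:
    """Hypothetical process that maintains a transitive dependency vector."""

    def __init__(self, pid, n):
        self.pid = pid
        self.dv = [0] * n          # dv[j]: highest interval of process j we depend on
        self.checkpoints = []      # dependency vectors saved with each local checkpoint
        self.checkpoint()          # initial checkpoint (index 0)

    def checkpoint(self):
        # Save the vector with the checkpoint, then start a new interval.
        self.checkpoints.append(list(self.dv))
        self.dv[self.pid] += 1

    def send(self):
        # Piggyback a copy of the current vector on the outgoing message.
        return list(self.dv)

    def receive(self, piggybacked):
        # Merge the sender's dependencies by component-wise maximum.
        self.dv = [max(a, b) for a, b in zip(self.dv, piggybacked)]


# Usage: P0 -> P1 -> P2 forms a causal chain, so P2's next checkpoint
# records a dependency on interval 1 of both P0 and P1.
p0, p1, p2 = Process(0, 3), Process(1, 3), Process(2, 3)
p1.receive(p0.send())
p2.receive(p1.send())
p2.checkpoint()
print(p2.checkpoints[-1])
```

The dependency of P2 on P0 is captured here only because the two messages form a causal chain. A dependency created by a noncausal message sequence would not appear in the vector, which is the situation RDT protocols prevent by forcing additional checkpoints.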

Similar resources

More Properties of Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability

Rollback-Dependency Trackability (RDT) is a property stating that all rollback dependencies between local checkpoints are on-line trackable using a transitive dependency vector. In this paper, we introduce some properties of communication-induced checkpointing protocols possessing the RDT property. First, we demonstrate that wherever an RDT protocol detects a PCM-path in the checkpoint and comm...

RDT-Partner: An Efficient Checkpointing Protocol that Enforces Rollback-Dependency Trackability

Checkpoint patterns that enforce rollback-dependency trackability (RDT) have only on-line trackable checkpoint dependencies and allow efficient solutions to the determination of consistent global checkpoints. The design of RDT checkpointing protocols that are efficient both in terms of the number of forced checkpoints and in terms of the data structures propagated by the processes is a very int...

Rollback-Dependency Trackability: A Minimal Characterization and Its Protocol

Considering a checkpoint and communication pattern, the rollback-dependency trackability (RDT) property stipulates that there is no hidden dependency between local checkpoints. In other words, if there is a dependency between two checkpoints due to a noncausal sequence of messages (Z-path), then there exists a causal sequence of messages (C-path) that doubles the noncausal one and that establis...

Protocol for Coordinated Checkpointing using Smart Interval with Dual Coordinator

Introduction to Distributed System Design, Google Code University, http://code.google.com/edu/parallel/dsd-tutorial.html#Basics; D. Manivannan, R. H. B. Netzer & M. Singhal, "Finding Consistent Global Checkpoints in a Distributed Computation", IEEE Trans. on Parallel & Distributed Systems, Vol. 8, No. 6, pp. 623-627 (June 1997); J. Tsai & S. Kuo, "Theoretical Analysis for Commun...

Journal title:
  • IEEE Trans. Parallel Distrib. Syst.

Volume 9  Issue 

Pages  -

Publication date 1998